In [1]:
import pandas as pd
import seaborn as sns
import plotly.express as px

import matplotlib.pyplot as plt
In [2]:
import plotly.io as pio
pio.renderers.default = "plotly_mimetype+notebook"

Matplotlib¶

For this exercise, we have written the following code to load the stock dataset built into plotly express.

In [3]:
stocks = px.data.stocks()
stocks.head()

Question 1:¶

Select a stock and create a suitable plot for it. Make sure the plot is readable and shows the relevant information, such as dates and values.

In [4]:
# YOUR CODE HERE
fig, ax = plt.subplots()
ax.plot(stocks.date, stocks.GOOG)
ax.xaxis.set_major_locator(plt.MaxNLocator(5))  # limit the number of x-axis ticks
ax.set_xlabel("Date")
ax.set_ylabel("GOOG value")
plt.show()  # renders inline; fig.show() needs a GUI backend

Question 2:¶

You've already plotted data from one stock. It is also possible to plot several stocks in the same figure to support comparison.
To distinguish the lines, customise line styles, markers and colors, and add a legend to the plot.

In [5]:
# YOUR CODE HERE
fig, ax = plt.subplots()

#for st in stocks.columns:
#    if not st == "date":
#        ax.plot(stocks.date, stocks[st], label=st)
ax.plot(stocks.date, stocks.GOOG, label="GOOG", linestyle='dashdot', color='r')
ax.plot(stocks.date, stocks.AAPL, label="AAPL", linestyle='solid', color='b')
ax.plot(stocks.date, stocks.AMZN, label="AMZN", linestyle='dotted', color='y')
ax.plot(stocks.date, stocks.FB, label="FB", linestyle='dashed', color='g')
ax.plot(stocks.date, stocks.MSFT, label="MSFT")

ax.xaxis.set_major_locator(plt.MaxNLocator(5))
ax.legend()
plt.show()  # renders inline; fig.show() needs a GUI backend
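
Styling each line by hand works for a handful of stocks. As an alternative, here is a minimal sketch of a loop that plots every column of a wide dataframe and cycles through line styles. The helper name `plot_wide` and the small `demo` frame are hypothetical stand-ins, not part of the exercise; in the notebook, `stocks` could be passed instead.

```python
import itertools

import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import MaxNLocator


def plot_wide(frame, x="date"):
    """Plot every column except `x` as its own line, cycling through line styles."""
    styles = itertools.cycle(["solid", "dashed", "dotted", "dashdot"])
    fig, ax = plt.subplots()
    for column in frame.columns:
        if column == x:
            continue  # the x column is the shared axis, not a series
        ax.plot(frame[x], frame[column], label=column, linestyle=next(styles))
    ax.xaxis.set_major_locator(MaxNLocator(5))  # keep the date ticks sparse
    ax.set_xlabel(x)
    ax.legend()
    return fig, ax


# Hypothetical stand-in data with the same wide layout as `stocks`.
demo = pd.DataFrame({
    "date": ["2018-01-01", "2018-01-08", "2018-01-15", "2018-01-22"],
    "GOOG": [1.00, 1.02, 1.03, 1.01],
    "AAPL": [1.00, 1.01, 1.04, 1.06],
})
fig, ax = plot_wide(demo)
```

Calling `plt.show()` afterwards displays the figure in the notebook.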

Seaborn¶

First, load the tips dataset.

In [6]:
tips = sns.load_dataset('tips')
tips.head()
Out[6]:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

Question 3:¶

Let's explore this dataset. Pose a question and create a plot that helps you answer it.

Some possible questions:

  • Are there differences between male and female customers when it comes to giving tips?
  • Which attribute correlates the most with tip?
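
For the correlation question, one possible starting point is to rank the numeric columns by the strength of their correlation with `tip`. The sketch below assumes Pearson correlation is a reasonable measure and uses only the five rows shown in `tips.head()` above as a stand-in, so its numbers say nothing about the full dataset; in the notebook, `tips` would replace `sample`.

```python
import pandas as pd

# Stand-in: the five rows from tips.head(), numeric columns only.
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "size": [2, 3, 3, 2, 4],
})

# Correlate every numeric column with `tip`, drop the trivial self-correlation,
# and sort by absolute strength so the strongest relationship comes first.
numeric = sample.select_dtypes("number")
strength = numeric.corr()["tip"].drop("tip").abs().sort_values(ascending=False)
print(strength)
```

A scatter plot of `tip` against the top-ranked column is then a natural plot to support the answer.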
In [7]:
# YOUR CODE HERE
# Question: on which day are tips highest relative to the total bill?

tips['relative'] = tips.tip/tips.total_bill
sns.boxplot(data=tips, x="day", y="relative")
Out[7]:
<AxesSubplot:xlabel='day', ylabel='relative'>

Plotly Express¶

Question 4:¶

Redo the above exercises (questions 2 & 3) with plotly express. Create diagrams that you can interact with.

The stocks dataset¶

Hints:

  • Turn the stocks dataframe into a structure that plotly express can pick up easily
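
One way to read this hint is as a wide-to-long reshape with `DataFrame.melt`, so that each row holds one date, one stock name and one value. This is a sketch under that assumption; the `wide` frame and the column names `stock` and `value` are hypothetical choices for illustration, and in the notebook `stocks` would take the place of `wide`.

```python
import pandas as pd

# Hypothetical stand-in with the same wide layout as `stocks`:
# one date column plus one column per stock.
wide = pd.DataFrame({
    "date": ["2018-01-01", "2018-01-08"],
    "GOOG": [1.00, 1.02],
    "AAPL": [1.00, 1.01],
})

# Keep `date` as the identifier and stack the stock columns into two new
# columns: the stock name and its value on that date.
long_stocks = wide.melt(id_vars="date", var_name="stock", value_name="value")
print(long_stocks)
```

A long frame like this can be passed to `px.line(long_stocks, x="date", y="value", color="stock")` to get one colored line per stock.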
In [8]:
allstocks = ["AAPL", "AMZN", "FB"]

# Build the plotly figure first, then restyle its first trace (AAPL)
fig = px.line(stocks, x='date', y=allstocks)
fig.data[0].line.color = 'rgb(204, 20, 204)'

fig.show()

The tips dataset¶

In [9]:
# Question: what is the average bill customers get when eating out?

tipsdataset = px.data.tips()
fig = px.box(tipsdataset, y="total_bill")
fig.show()

Question 5:¶

Recreate the barplot below that shows the population of different continents for the year 2007.

Hints:

  • Extract the 2007 data from the dataframe and process it accordingly
  • Use a plotly bar chart
  • Add different colors for different continents
  • Sort the continents in the visualisation using the axis layout settings
  • Add text to each bar that represents the population
In [10]:
#load data
df = px.data.gapminder()
df.head()
Out[10]:
       country continent  year  lifeExp       pop   gdpPercap iso_alpha  iso_num
0  Afghanistan      Asia  1952   28.801   8425333  779.445314       AFG        4
1  Afghanistan      Asia  1957   30.332   9240934  820.853030       AFG        4
2  Afghanistan      Asia  1962   31.997  10267083  853.100710       AFG        4
3  Afghanistan      Asia  1967   34.020  11537966  836.197138       AFG        4
4  Afghanistan      Asia  1972   36.088  13079460  739.981106       AFG        4
In [11]:
# YOUR CODE HERE

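# Keep only 2007 and sum the population per continent
year2007 = df.query("year == 2007").groupby("continent", as_index=False)["pop"].sum()
fig = px.bar(year2007, x='continent', y='pop', color='continent', text='pop')

# Order the bars from the most to the least populous continent
fig.update_xaxes(categoryorder='total descending')

fig.show()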